Automated SNP Genotype Clustering Algorithm to Improve Data Completeness in High-Throughput SNP Genotyping Datasets from Custom Arrays

نویسندگان

  • Edward M. Smith
  • Jack Littrell
  • Michael Olivier
چکیده

High-throughput SNP genotyping platforms use automated genotype calling algorithms to assign genotypes. While these algorithms work efficiently for individual platforms, they are not compatible with other platforms, and have individual biases that result in missed genotype calls. Here we present data on the use of a second complementary SNP genotype clustering algorithm. The algorithm was originally designed for individual fluorescent SNP genotyping assays, and has been optimized to permit the clustering of large datasets generated from custom-designed Affymetrix SNP panels. In an analysis of data from a 3K array genotyped on 1,560 samples, the additional analysis increased the overall number of genotypes by over 45,000, significantly improving the completeness of the experimental data. This analysis suggests that the use of multiple genotype calling algorithms may be advisable in high-throughput SNP genotyping experiments. The software is written in Perl and is available from the corresponding author.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A multi-array multi-SNP genotyping algorithm for Affymetrix SNP microarrays

MOTIVATION Modern strategies for mapping disease loci require efficient genotyping of a large number of known polymorphic sites in the genome. The sensitive and high-throughput nature of hybridization-based DNA microarray technology provides an ideal platform for such an application by interrogating up to hundreds of thousands of single nucleotide polymorphisms (SNPs) in a single assay. Similar...

متن کامل

Model Based Probe Fitting and Selection for SNP Array

Recent advances of high-throughput SNP arrays such as Affymetrix’s GeneChip Human Mapping 500K array set have made it possible to genotype large samples in a fast and cheap manner. A lot of algorithms were developed to call the genotypes from SNP array. When considering the low level preprocessing of SNP array, most algorithms just borrow the techniques from the gene expression microarray. As i...

متن کامل

ALCHEMY: a reliable method for automated SNP genotype calling for small batch sizes and highly homozygous populations

MOTIVATION The development of new high-throughput genotyping products requires a significant investment in testing and training samples to evaluate and optimize the product before it can be used reliably on new samples. One reason for this is current methods for automated calling of genotypes are based on clustering approaches which require a large number of samples to be analyzed simultaneousl...

متن کامل

MACGT: multi-dimensional automated clustering genotyping tool for analysis of microarray-based mini-sequencing data

SUMMARY Multi-dimensional Automated Clustering Genotyping Tool (MACGT) is a Java application that clusters complex multi-dimensional vector data derived from single nucleotide polymorphism (SNP) genotyping experiments using mini-sequencing based microarray chemistries such as arrayed primer extension (APEX). Spot intensity output files from microarray experiments across multiple samples are imp...

متن کامل

Linear Clustering with Application to Single Nucleotide Polymorphism Genotyping

Single nucleotide polymorphisms (SNPs) have been increasingly popular for a wide range of genetic studies. A high-throughput genotyping technologies usually involves a statistical genotype calling algorithm. Most calling algorithms in the literature, using methods such as k-means and mixturemodels, rely on elliptical structures of the genotyping data; they may fail when the minor allele homozyg...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره 5  شماره 

صفحات  -

تاریخ انتشار 2007